Abstract: MAP Structured Output Prediction by Sampling
Authors
Abstract
Shankar Vembu, Thomas Gärtner, and Mario Boley
[email protected]
Fraunhofer IAIS, Schloß Birlinghoven, 53754 Sankt Augustin, Germany

We consider maximum a posteriori parameter estimation for structured output prediction with exponential family models. In this setting the main difficulty lies in computing the partition function and the first-order moment of the sufficient statistics. We consider the case in which efficient algorithms for exact uniform sampling from the output space exist. This assumption is orthogonal to the typical assumptions made in structured output learning. It holds, in particular, for the highly relevant problem of sampling potent drugs. Under our uniform sampling assumption we show that exactly computing the partition function is intractable (Section 2) but that it can be approximated efficiently (Section 3). Furthermore, we show that the first-order moment of the sufficient statistics can also be approximated (Section 4) and that we can sample according to the estimated distribution (Section 5). After describing a few simple application settings (Section 6), we discuss related and future work (Section 7).

1. Preliminaries and Problems

We use [[n]] to denote {1, . . . , n}. Let X and Y be the input and output spaces, respectively, where Y is parameterised by some finite alphabet Σ. For instance, Y can consist of strings, trees, or graphs over Σ. Let {(xi, yi)}i∈[[m]] ⊆ X × Y be a set of observations. Our goal is to find θ, the maximum a posteriori parameters of the conditional exponential family model

    p(y | x, θ) = exp(〈φ(x, y), θ〉) / Z(θ | x),

where φ(x, y) are the joint sufficient statistics of x and y, and

    Z(θ | x) = ∑_{y∈Y} exp(〈φ(x, y), θ〉)

is the partition function. Imposing a normal prior on θ, this leads to minimising the negative log posterior

    λ‖θ‖² + ∑_{i∈[[m]]} ( log Z(θ | xi) − 〈φ(xi, yi), θ〉 ),

where λ > 0 is a regularisation constant determined by the prior.
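The key identity behind the sampling approach is that Z(θ | x) = |Y| · E_{y ~ Unif(Y)}[exp(〈φ(x, y), θ〉)], so exact uniform samples yield Monte Carlo estimates of the partition function, and self-normalised importance weights yield estimates of the first-order moment. The sketch below is a toy illustration only, assuming Y consists of binary strings of length n (so |Y| = 2ⁿ and exact uniform sampling is trivial); the statistic `phi`, the dimensions, and all numbers are hypothetical and not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10  # hypothetical: outputs are binary strings of length n, so |Y| = 2**n

def phi(x, y):
    # Toy joint sufficient statistics (illustration only, 4-dimensional).
    return np.array([y.sum(), (x * y).sum(), (y[:-1] * y[1:]).sum(), 1.0])

def estimate_log_partition(x, theta, num_samples=5000):
    """Monte Carlo estimate of log Z(theta | x), using
    Z = |Y| * E_{y ~ Unif(Y)}[exp(<phi(x, y), theta>)]."""
    vals = np.empty(num_samples)
    for s in range(num_samples):
        y = rng.integers(0, 2, size=n)          # exact uniform sample from Y
        vals[s] = phi(x, y) @ theta
    m = vals.max()                               # log-sum-exp for stability
    return n * np.log(2.0) + m + np.log(np.exp(vals - m).mean())

def estimate_moment(x, theta, num_samples=5000):
    """Self-normalised estimate of E_{p(y|x,theta)}[phi(x, y)]: uniform
    samples reweighted by importance weights exp(<phi(x, y), theta>)."""
    feats = np.array([phi(x, rng.integers(0, 2, size=n))
                      for _ in range(num_samples)])
    logw = feats @ theta
    w = np.exp(logw - logw.max())                # unnormalised weights
    return (w[:, None] * feats).sum(axis=0) / w.sum()

x = rng.normal(size=n)
theta = 0.1 * rng.normal(size=4)
print(estimate_log_partition(x, theta))
print(estimate_moment(x, theta))
```

For small n the estimate can be checked against exhaustive enumeration of all 2ⁿ outputs; the point of the paper is that the estimator remains usable when enumeration is intractable.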
Related Papers
Exact and Approximate Inference for Annotating Graphs with Structural SVMs
Training processes of structured prediction models such as structural SVMs involve frequent computations of the maximum a posteriori (MAP) prediction given a parameterized model. For specific output structures such as sequences or trees, MAP estimates can be computed efficiently by dynamic programming algorithms such as the Viterbi algorithm and the CKY parser. However, when the output structure...
Probabilistic Structured Predictors
We consider MAP estimators for structured prediction with exponential family models. In particular, we concentrate on the case that efficient algorithms for uniform sampling from the output space exist. We show that under this assumption (i) exact computation of the partition function remains a hard problem, and (ii) the partition function and the gradient of the log partition function can be a...
Variance Reduction for Structured Prediction with Bandit Feedback
We present BanditLOLS, an algorithm for learning to make joint predictions from bandit feedback. The learner repeatedly predicts a sequence of actions, corresponding to either a structured output or control behavior, and observes feedback for that single output and no others. To address this limited feedback, we design a structured cost-estimation strategy for predicting the costs of some unobs...
Canonical Correlation Inference for Mapping Abstract Scenes to Text
We describe a technique for structured prediction, based on canonical correlation analysis. Our learning algorithm finds two projections for the input and the output spaces that aim at projecting a given input and its correct output into points close to each other. We demonstrate our technique on a language-vision problem, namely the problem of giving a textual description to an “abstract scene.”
Learning Where to Sample in Structured Prediction
In structured prediction, most inference algorithms allocate a homogeneous amount of computation to all parts of the output, which can be wasteful when different parts vary widely in terms of difficulty. In this paper, we propose a heterogeneous approach that dynamically allocates computation to the different parts. Given a pre-trained model, we tune its inference algorithm (a sampler) to incre...